Integrating Association Rule Mining Algorithms with Relational Database Systems

Authors

  • Jochen Hipp
  • Ulrich Güntzer
  • Udo Grimmer
Abstract

Mining for association rules is one of the fundamental data mining methods. In this paper we describe how to efficiently integrate association rule mining algorithms with relational database systems. From our point of view, direct access of the algorithms to the database system is a basic requirement when transferring data mining technology into daily operation. This is especially true in the context of large data warehouses, where exporting the mining data and preparing it outside the database system becomes cumbersome or even infeasible. The development of our own approach is mainly motivated by shortcomings of current solutions. We investigate the most challenging problems by contrasting the prototypical but somewhat academic association mining scenario from basket analysis with a real-world application. We thoroughly compile the requirements arising from mining an operative data warehouse at DaimlerChrysler. We generalize the requirements and address them by developing our own approach. We explain its basic design and give the details behind our implementation. Based on the warehouse, we evaluate our own approach together with commercial mining solutions. It turns out that, regarding runtime and scalability, we clearly outperform the commercial tools accessible to us. More importantly, our new approach supports mining tasks that are not directly addressable by commercial mining solutions.

Similar resources

A Seamless Integration of Association Rule Mining with Database Systems

The need for Knowledge and Data Discovery Management Systems (KDDMS) that support ad hoc data mining queries has been long recognized. A significant amount of research has gone into building tightly coupled systems that integrate association rule mining with database systems. In this paper, we describe a seamless integration scheme for database queries and association rule discovery using a com...

Efficient Mining for Association Rules with Relational Database Systems

With the tremendous growth of large-scale data repositories, a need has arisen for integrating the exploratory techniques of data mining with the capabilities of relational systems to efficiently handle large volumes of data. In this paper, we look at the performance of the most prevalent association rule mining algorithm, Apriori, with IBM’s DB2 Universal Database system. We show that a mult...

Association Rule Mining of Relational Data

Most data of practical relevance are structured in more complex ways than is assumed in traditional data mining algorithms, which are based on a single table. The concept of relations allows for discussing many data structures such as trees and graphs. Relational data have much generality and are of significant importance, as demonstrated by the ubiquity of relational database management system...

Predator-Miner: Ad hoc Mining of Associations Rules within a Database Management System

In this demonstration, we present a prototype system, Predator-Miner, which extends Predator with a relational-like association rule mining operator to support data mining operations. Predator-Miner allows a user to combine association rule mining queries with SQL queries. This approach towards tight integration differs from existing techniques of using user-defined functions (UDFs), stored pro...

Integrating Pattern Mining in Relational Databases

Almost a decade ago, Imielinski and Mannila introduced the notion of Inductive Databases to manage KDD applications just as DBMSs successfully manage business applications. The goal is to follow one of the key DBMS paradigms: building optimizing compilers for ad hoc queries. During the past decade, several researchers proposed extensions to the popular relational query language, SQL, in order t...

Publication date: 2001